File management improvements #24

Merged
merged 22 commits into from
Jan 21, 2025

Collaborator

@MehmedGIT MehmedGIT commented Dec 20, 2024

What is in this PR:

  1. Improve the file management in Operandi:
  • Introduce an extra download worker with a new download queue. The job status worker pushes a download request to the download queue when the status of a job becomes SUCCESS. Unlike the status queue, the download queue is persistent and survives Operandi service restarts. Hence, downloads of HPC results (workflow jobs/workspace results) are not interrupted.
  • Refactor all worker classes in the Operandi Broker to remove duplicate code and simplify the job status worker. The downloaded HPC results of succeeded jobs are now properly deleted from the HPC. For the time being, failed jobs are still preserved only on the HPC and not downloaded (how to handle the files of failed jobs properly is still to be decided).
  • Zips transferred to or downloaded from the HPC environment are now properly deleted from the Operandi Server local storage after the transfer. Previously, some zips were not deleted under certain circumstances (e.g. exceptions due to a wrong zip format or a failed unzip).
  • Set the logging level of some OtoN converter messages to DEBUG to prevent log flooding.
  • Reduce the log levels of some 3rd party libraries (pika, paramiko, ocrd-core) to WARNING to prevent log flooding.
  • Other small improvements and fixes
  2. Add a flag to the admin batch endpoints that lets admins hide or show deleted resources when requesting resources. Regular users can see only non-deleted resources. Usually, when a resource is deleted, the deletion happens locally, but the database still keeps all the metadata of that record. Deleted resources are identified in the DB by the deleted field of each resource.
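
The visibility rule for deleted resources could be sketched roughly as follows. This is only an illustration, not the code in this PR: the `Resource` class, the `list_resources` function and the `hide_deleted` parameter name are hypothetical, and only the `deleted` field comes from the description above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Resource:
    # Hypothetical stand-in for a DB record; only `deleted` is named in the PR text.
    resource_id: str
    deleted: bool = False


def list_resources(
    records: List[Resource], is_admin: bool, hide_deleted: bool = True
) -> List[Resource]:
    """Return the records the caller is allowed to see.

    Regular users never see deleted records. Admins see them only
    when they explicitly pass hide_deleted=False (assumed flag name).
    """
    if is_admin and not hide_deleted:
        return list(records)
    return [record for record in records if not record.deleted]
```

In this sketch the flag is ignored for regular users, so a non-admin request cannot reveal deleted records even if it sets the flag.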

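The zip cleanup described in the first item (deleting the archive even when unzipping fails) can be illustrated with a `try`/`finally` pattern. This is a minimal sketch using only the Python standard library; the function name `unpack_and_cleanup` is hypothetical and does not reflect the actual Operandi code.

```python
import zipfile
from pathlib import Path


def unpack_and_cleanup(zip_path: Path, dest_dir: Path) -> bool:
    """Extract zip_path into dest_dir and always delete the zip afterwards.

    Returns True on success and False if the archive is not a valid zip.
    """
    try:
        with zipfile.ZipFile(zip_path) as archive:
            archive.extractall(dest_dir)
        return True
    except zipfile.BadZipFile:
        # A wrong zip format must not leave the archive behind.
        return False
    finally:
        # Runs on success and on failure, so the zip never lingers on disk.
        zip_path.unlink(missing_ok=True)
```

Putting the deletion in `finally` rather than after the extraction is what covers the failure cases mentioned above.
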
@MehmedGIT MehmedGIT marked this pull request as ready for review January 21, 2025 13:48
@MehmedGIT MehmedGIT merged commit 52aa423 into main Jan 21, 2025
14 checks passed
@MehmedGIT MehmedGIT deleted the file-management-improvements branch January 22, 2025 12:18